Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
- In an era of information overload, research writing, particularly literature review composition, has become increasingly burdensome due to the sheer volume of scholarly publications released each year. This paper introduces WriteAssist, a novel standalone authoring system that helps researchers efficiently generate literature review sections. Given the title and abstract of a work-in-progress manuscript, WriteAssist automatically retrieves relevant and recent peer-reviewed articles, highlighting portions that offer supporting or contrasting perspectives. A key innovation lies in its personalized recommendation engine, which tailors results based on the user's prior publications and research profile, enabling context-aware synthesis. We position WriteAssist within the landscape of intelligent writing assistants, academic search platforms, and personalized recommender systems, and we detail its architecture, which integrates natural language processing and user modeling to streamline academic writing. The system represents a significant step toward alleviating cognitive overload in scholarly composition and offers a blueprint for smarter, adaptive tools in academic research support.
  Free, publicly-accessible full text available September 15, 2026
- Scientific workflows are pivotal for managing complex computational tasks, including data analysis, processing, simulation, and visualization. However, their design and administration typically demand substantial programming expertise, limiting access for domain scientists. Many such workflow systems also lack real-time execution tracking and streamlined data integration capabilities, hindering efficiency and repeatability in scientific experimentation. In response, we introduce VisFlow 2.0, a next-generation platform derived from the original VisFlow. We compare VisFlow 2.0 to traditional alternatives through a well-studied computational pipeline, highlighting its usability, flexibility, and effectiveness, especially for non-expert users.
  Free, publicly-accessible full text available June 23, 2026
- A vast proportion of scientific data remains locked behind dynamic web interfaces, often called the deep web, where it is inaccessible to conventional search engines and standard crawlers. This gap between data availability and machine usability hampers the goals of open science and automation. While registries like FAIRsharing offer structured metadata describing data standards, repositories, and policies aligned with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles, they do not enable seamless, programmatic access to the underlying datasets. We present FAIRFind, a system designed to bridge this accessibility gap. FAIRFind autonomously discovers, interprets, and operationalizes access paths to biological databases on the deep web, regardless of their FAIR compliance. Central to our approach is the Deep Web Communication Protocol (DWCP), a resource description language that represents web forms, HyperText Markup Language (HTML) tables, and file-based data interfaces in a machine-actionable format. Leveraging large language models (LLMs), FAIRFind combines a specialized deep web crawler and web-form comprehension engine to transform passive web metadata into executable workflows. By indexing and embedding these workflows, FAIRFind enables natural language querying over diverse biological data sources and returns structured, source-resolved results. Evaluation across multiple open-source LLMs and database types demonstrates over 90% success in structured data extraction and high semantic retrieval accuracy. FAIRFind advances existing registries by turning linked resources from static references into actionable endpoints, laying a foundation for intelligent, autonomous data discovery across scientific domains.
  Free, publicly-accessible full text available July 26, 2026
- Biologists often set out to find relevant data in an ever-changing landscape of interesting databases. While leading journals publish descriptions of databases, these lists are usually not recent and are not frequently updated to discard defunct or poor-quality databases. These indices usually include databases whose authors proactively request inclusion. The challenge for individual biologists, then, is to discover, explore, and select databases of interest from a large unorganized collection and effectively use them in their analysis without too large an investment. Advocacy of the FAIR data principles to improve searching, finding, accessing, and interoperating among these diverse information sources in order to increase usability is proving to be a difficult proposition, and consequently a large number of data sources are not FAIR-compliant. Since linked open data do not guarantee FAIRness, biologists are now left to individually search for information in open networks. In this paper, we propose SoDa, a system for intelligent data foraging on the internet by biologists. SoDa helps biologists discover resources based on analysis requirements and generate resource access plans, and it stores cleaned data and knowledge for community use. SoDa includes a natural language-powered resource discovery tool, along with tools to retrieve data from remote databases, organize and store collected data, query stored data, and seek help from the community when things do not work as anticipated. A secondary search index is also supported so that community members can conveniently find archived information and reuse it. The features supported in SoDa endow biologists with data integration capabilities over arbitrary linked open databases and let them construct powerful computational pipelines using them, capabilities that are not supported in most contemporary biological workflow systems, such as Taverna or Galaxy.
  Free, publicly-accessible full text available January 10, 2026
 An official website of the United States government
